Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[POC] Add a new Strict parser #164

Closed
wants to merge 1 commit into from
Closed

Conversation

antstorm
Copy link
Contributor

⚠️ Not yet ready to be merged ⚠️

Opening this PR to start a discussion on the current approach to parsing.

Over time I have worked/consulted for a few companies that use this gem internally. However all of them had to monkey-patch the gem in order to significantly restrict string parsing (especially when it comes to user input and even developer input).

The main issues being:

  1. Some wrong inputs get treated as amounts (e.g. 29.01.2023, 29,12,12, etc)
  2. Extra symbols are simply ignored (instead of suggesting incorrect input, e.g. one1two2three3)
  3. Loose thousands separator and decimal mark matching (10,000 and 10,00 yield different results)

I think that the current approach is "find any signs of monetary amount in the input". While this works for some use-cases, I would argue that this might not be the best default behaviour. For any company providing financial services or even consumer retail this unfortunately is straight unacceptable.

Therefore I'm proposing adding an optional strict parser (and potentially making it default). This parser takes a different approach — it finds known good partial matches (currency, symbol, amount, etc) and then checks it against allowed formats (e.g. [:symbol, :amount] for $99.99). This allows gem users to ensure that only allowed formats are matched and no unwanted characters were present.

The PR is a proof of concept and passes most specs with expected exceptions related specifically to:

  • Matching €10,000 as 10K, instead of 10 (should be tweak-able)
  • Not matching £.45B as this is a rarely accepted format
  • Returning nil when parsing nil (happy to change this to Money.empty)
  • Not matching 19.12.89 as this doesn't look like an amount

This is a big(ish) change, so I'd like to gather more input from others before I make further progress.

Any questions, concerns and feedback is very welcome! 🙌

@semmons99
Copy link
Member

I like the concept. You are correct this would be a huge, potentially breaking change. I would see the rollout as something like

  1. release version 1.13.0 with the new strict parser method anyone can use
  2. gather any feedback
  3. release version 1.14.0 which allows for the default parser to be changed to strict parsing
  4. gather any feedback
  5. release version 1.15.0 which warns the default parser will be changing in 2.0.0
  6. release version 2.0.0.beta1 which changes the default and warns that it has changed the default parser
  7. release version 2.0.0 which has a post install dialogue starting the change to parsing

I could also see a world where 2.0.0 requires the parser be explicitly set so that no one is caught off guard. Then in 2.1.0 (or even 3.0.0) we set the default to the new strict parser.

@santiagodoldan
Copy link

Are there any plans on moving forward with this strict parser?

@semmons99
Copy link
Member

I have not heard a strong outcry for—or against—moving forward. Given that, the default is to keep the status quo.

@allolex
Copy link
Contributor

allolex commented Feb 28, 2024

At the risk of being a Johnny-come-lately, I like this idea, but I also feel the "strict" parsing has its own burden of opinions.

I feel like a more flexible approach might be better.

Here are two ideas:

First idea: Support custom parsers and document strict parsing

  1. Refactor using the code here to allow injecting a custom parser.
  2. Once the codebase supports custom parsers, document this "strict" parser as an option, but not as the default.
  3. Release the strict parsing as a new major version if we get consensus for that.

Second idea: Support granular configuration

  1. Refactor the existing parser as configuration, maybe in an initializer block.
  2. Express the "strict" parsing rules as configuration.
  3. Document how to turn on strict parsing via configuration.

Tests would need to be present and pass in either case.

What do you all think?

@semmons99
Copy link
Member

I think both are good ideas. However, they are both a lot of work that someone needs to feel passionate enough to undertake.

@semmons99
Copy link
Member

been a year. closing.

@antstorm
Copy link
Contributor Author

@allolex sorry for a very long delay here, but I've finally managed to make some progress on it — #183. I'm using the approach you've suggested and making parsers injectable. It adds a bit more complexity, but given how potentially breaking this change might be it's worth keeping in place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants